The Per-Fide Corpus: A New Resource for Corpus-Based Terminology, Contrastive Linguistics and Translation Studies
Abstract
The Per-Fide project is a collaboration between researchers at the Department of Informatics and the Institute of Arts and Humanities at the University of Minho, Portugal. The acronym Per-Fide stands for Portuguese (P) in parallel with six languages: English (E), Russian (R), French (F), Italian (I), German/Deutsch (D) and Spanish/Español (E). First, we expound on the role of the Per-Fide project within the context of existing corpora that include the Portuguese language in its different variants, namely European Portuguese, Brazilian Portuguese and the Portuguese spoken in African countries (Angola, Mozambique, Guinea-Bissau, Cape Verde, Sao Tome and Principe). The idea of creating a multilingual parallel corpus project in which Portuguese assumes a pivotal role arose primarily because the majority of online corpora that include Portuguese are either monolingual or bilingual. Furthermore, these corpora focus mainly on one specific text type. Moreover, the few multilingual parallel corpora that include Portuguese consist of a relatively small Portuguese subcorpus and provide limited search facilities, mainly because the Portuguese texts have not been morphologically tagged and/or syntactically annotated. Our second goal in this chapter is to provide an overview of the design criteria for the development of tools and resources in the various stages of the Per-Fide corpora construction process, focusing particularly on automation, validation, generalization and resource sharing. Here, we briefly describe the workflow components involved in the pre- and post-alignment phases. Finally, we draw attention to several practical applications of the current features of the Per-Fide corpus in translation practice and contrastive linguistic studies, focusing on the use and potential of probabilistic translation dictionaries and the role that parallel corpora can play in translating idioms.
Similar resources
The corpus approach: a common way forward for Contrastive Linguistics and Translation Studies?
In this introductory chapter, Granger traces the development of Contrastive Linguistics and Translation Studies over the last decades to the present day, focusing on the role of the computer corpus in giving new impetus to each field and bringing them closer together. She discusses the different types of monolingual and multilingual corpora being used in CL and TS research, proposing at the sam...
Translation and contrastive linguistic studies at the interface of English and Chinese: Significance and implications
Corpora have revolutionized nearly all areas of linguistic research over the past four decades (McEnery, Xiao and Tono 2006; McEnery and Hardie 2012). Translation studies and contrastive linguistics are no exceptions. Indeed, the rapid development of bilingual parallel corpora as well as monolingual and multilingual comparable corpora since the early 1990s has been of particular relevance and c...
Introducing the Per-Fide Project: Parallelizing Portuguese with six different Languages
In this paper we present the Per-Fide project, aimed at the construction of parallel corpora mapping the Portuguese language to six other languages (English, Russian, French, Italian, German and Spanish) in various domains including literary, journalistic and religious texts. First we will focus on the corpus design criteria and its main features, particularly those that distinguish this corpus f...
Corpus based coreference resolution for Farsi text
"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two words are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
A Corpus-Based Study of zunshou and Its English Equivalents
This paper describes a corpus-based contrastive study of collocation in English and Chinese. In light of the corpus-based approach to identify functionally equivalent units, the present paper attempts to identify the collocational translation equivalents of zunshou by using a parallel corpus and two comparable corpora. This study shows that more often than not, we can find in English more than ...